Computational screening and molecular docking of compounds from Traditional Chinese Medicine (TCM) by targeting DNA topoisomerase I to design potential anticancer drugs

Each year thousands of people suffer across the globe due to higher cancer incidence and mortality rates. Additionally, the treatment option for cancer patients is also costly, and often cancer drugs suffer from lower efficacy with more side effects. The DNA topoisomerase can function as an established cancer target because Human Topoisomerase (Top1) regulates genetic transcription during the post-mitotic phase and plays a critical role in DNA supercoiling during replication and repair. Therefore, during drug therapy, blocking the Top1 may be crucial for inhibiting the proliferation of cancer cells. Here, the TCM (traditional Chinese medicine) compounds have been screened through the virtual screening. The Chinese medicine library’s virtual screening process made it possible to narrow down the compound list to 29 compounds based on binding energy (-7.1 to -9.3Kcal/mol), while following Lipniski filtering, MM/PB (GB) SA filtering was used to screen the remaining 22 compounds and the top four compounds were chosen based on binding free energy. Here, the four compounds; CID-65752 (T2972: Rutaecarpine), CID-5271805 (T4S2126: Ginkgetin), CID-9817839 (T2S2335: Dehydroevodiamine) and CID-51106 (T3054: Daurisoline) had comparatively higher binding energy of -8.2, -8.5, -8.3 and -8.2 respectively during molecular docking than other compounds. Among these four compounds, no toxic profile of the two screened compounds; CID-5271805 and CID-9817839 was found in ADMET filtering. Moreover, the SASA (solvent accessible surface area), Rg (radius of gyrations), RMSD (root mean square deviation), and RMSF (root mean square fluctuation) profile of the drug-protein complex reveals the stability and rigidity of the compounds in molecular dynamics simulation study. However, these studies need to be validated in experimental approaches to develop more potent and effective cancer drugs.


Introduction
Cancer poses a high risk threat globally where 10 million deaths as well as 19.3 million newly diagnosed have been reported [1].A large proportion of cancer patients are from low-and mid-income countries where higher mortality rate is estimated.The significantly higher cost of the treatment as well as the new drug development process resulted in less affordable treatment option [2].Also, drug resistance towards chemotherapeutic drugs and diverse side effects reduce the efficacy of the cancer drug [3].Therefore, new cancer drug development with better efficacy and a lesser degree of side effects with affordable options requires complex processes, timing, and funding [4].
The DNA topoisomerase plays a vital role in DNA supercoiling during replication as well as the transcription stage and is important for cell division.It can be targeted for antimicrobial and anticancer drug development processes [5].It relaxes DNA supercoiling by reducing torsional stress [6] and initiates cleavage of the phosphodiester bond on the DNA strand forming the 3 0phosphotyrosine linkage between the TYR273 [7].The DNA repair as well as genetic transcription are regulated by Human Topoisomerase (Top1) in post mitotic process.The blockage of the Top1 can be important for the inhibition of cancer cell growth during the drug treatment [8]; and camptothecin [9].Also, different analogs of the camptothecin [10] have been developed; for example, Topotecan, and Irinotecan [11].However, they possess multiple side effects [12] and drug resistance [13].For example, Camptothecin is chemically unstable and associated with the overexpression of ABCB1 and ABCG2 which lead to cross-resistance [14][15][16].
The computer-aided drug designing and virtual screening procedure dramatically reduce the time and complex procedure by screening readily available or designing new analogs.These approaches can be included in the identification of the drug targets, potential hit identifications from a library [17], and filtering the drug molecules based on pharmacological properties; absorption, distribution, excretion, metabolism, and toxicity [18].The popular drugdesigning approaches include virtual screening, molecular docking, pharmacophore-based designing, molecular dynamics, and QSAR [19].
The structure-based virtual screening procedure is one of the key steps in computer-aided drug designing where screening is conducted by targeting the molecular or biologically relevant target and drugs.This process can be divided into two major groups; ligand-based approaches; where drugs are developed and designed based on the similarity of the drugs or compound; and structure-based virtual screening where the primary focus is given the structure of the protein molecules [20].The screened database can be also very diverse, natural compounds, synthetic compounds library, and drug repurposing library [21].Numerous drugs have come into the markets from virtual screening procedures [22]; captopril, ritonavir, tirofibanan, and saquinavir [23].
In this research approach, the Chinese compound list was virtually screened against DNA topoisomerase protein.The best-selected compounds were further screened through MM-GBSA and ADMET filtering to obtain more accuracy.The dynamics simulation approaches were also employed to understand the stable nature of the DNA topoisomerase and compound complexes.

Protein and ligand preparation
The crystal structure of DNA topoisomerase I was retrieved from the Protein Data Bank (PDB:1T8I) [24].The protein structure was cleaned in Pymol [25] to remove the water molecules, and hetero atoms.The GROMOS 96 43B1 force field was used to optimize the Protein structure [26].The traditional Chinese medicine Library was prepared from 800 Chinese traditional medicines contain 2390 monomer compounds.This compound list includes various structural types such as flavonoids, alkaloids, terpenoids, and glycosides.Traditional Chinese Medicine library is a powerful library in the fields of anti-cancer, anti-inflammatory, antibacterial, apoptosis, and autophagy research [27].

Lipniski rule of five filtering
The lipniski rule of five determines the chemical compounds having certain pharmacological and biological properties which may lead to active drug in human.The Lipniski rule states that active drugs follow these mentioned criteria; (1) hydrogen bond donor less than 5, (2) less than 10 hydrogen bond acceptor, (3) less than 500 dalton molecular weight, (4) calculated octanol-water partition coefficient that not exceed 5, (5) number of rotatable bond less than 10.The top screened compounds were further filtered through Lipnsiki rule of five and the compounds which violates more than one rule were excluded [31].

MMGBSA and MMPBSA
The screened compound list was further screened by MM/PB(GB)SA methods where built in docking program was used to generate the ligand binding pose.The calculation and decomposition of binding free energy for each energy minimized protein ligand binding conformation was done.The receptor force field was set as ff19SB with the OPC water model and the ligand force field was set as GAFF2 with truncation radius of 8Å which retained the protein residues within 8Å of all ligand binding pose for rescoring [32].

ADMET
The absorption, distribution, metabolism, excretion, and toxicity (ADMET) were calculated in PKCSM [33].The canonical smiles of the top four ligand molecules were used as entries for the ADMET calculation.

Molecular dynamics
The YASARA dynamics software package was utilized to perform molecular dynamics simulation [34] where the force field was set as AMBER14.The drug protein complexes were optimized before the simulation along with hydrogen bond orientation and cleaning.The TIP3P solvation model was used to create the cubic simulation cell.The physiological parameter of the dynamics of the drug-protein complex was set as 300K temperature, 7.4 pH, and 0.9% NaCl [35].The Particle Mesh Ewald (PME) method was applied to calculate the long range electrostatic interaction by a cut off radius of 8 Å.The simulation cell box was set 20 Å larger than the drug-protein complex to allow the free motion.The energy minimization was conducted by the steepest gradient algorithms (5000 cycles) by simulated annealing method.The simulation was performed using an NPT ensemble for relaxation & minimization of the system, where temperature and pressure were maintained at 300 K and 1 atm, respectively, using the Berendsen thermostat and barostat.The time step of the simulation was set as 1.25fs and chemical bond involving the hydrogen bonds was fixed using SHAKE algorithms [36].The simulation trajectories were saved after 100ps time and extended for 100ns times.The simulation trajectories was utilised to calculate the root mean square deviation (RMSD), root mean square fluctuation (RMSF), radius of gyration (Rg) and solvent accessible surface area [37,38].The MM-PBSA was calculated with the MDs trajectories.The MM-PBSA was implemented by using following equation; Binding Energy = EpotRecept+EsolvRecept+EpotLigand+EsolvLigand−EpotComplex −EsolvComplex

Virtual screening
The virtual screening of the Chinese medicine library allowed to screen the compound list into 29 compo based on the binding energy where more negative energy indicates more favourable interactions.The top 29 compounds demonstrated binding energy from -7.1 to -9.3Kcal/mol ranges (Table 1).
Then the Lipniski filtering was applied to screen the compound list where violations of more than one Lipniski rule were considered especially the violations in molecular weight ranges and hydrogen bond donor and number of rotatable bonds were observed.The final 22 compounds after Lipniski filtering were filtered through MM/PB (GB) SA scoring.The MM/PB (GB) SA filtering allows to select top four compounds based on binding free energy (Table 2).

Binding interaction of the top compounds
The top 4 compounds and their binding residues while interacting with the target topoisomerase protein were retrieved from Pymol and Discovery Studio Software Package (Fig 1).
The CID-65752 and DNA topoisomerase complex had multiple Pi-Cation interactions at LYS525, GLU356, and GLU418 position and also one Pi-alkyl interaction was observed at LYS374 position.The CID-5271805 and DNA Topoisomerase complex was stabilized by five hydrogen bonds at GLU494, SER534, ARG364, ARG488, and THR501 residues.Also, this complex had one pi-cation bond at Lys493 and one pi-alkyl interaction at LYS532 residues.The CID-9817839 and protein complex showed two hydrogen bonds at ASP533, THR501, and one Pi-cation at LYS493, one alkyl at ARG364 and one Pi-Alkyl at LYS532 residues.The CID-51106 and DNA topoisomerase complex had five hydrogen bonds at ARG349, SER432, ARG749, LYS746, and LYS347 residues.It also had one alkyl interaction at ILE756, and two Pi-Sigma interaction at ALA753, and LYS202 (Table 3).

ADMET
The water solubility of compounds reflects the solubility of the molecules at 298K temperature.The lipid-soluble ones are less well absorbed than the water-soluble ones.The T4S2126 and T2S2335 had water solubility ranges from -2.92 to -3.6.The higher Caco2 permeability is measured as a value more than 0.90 where T2S2335 and T2972 had more Caco2 permeability compared to other compounds.The ability of the drug molecules to pass the blood-brain barrier is a crucial parameter to alleviate the side effects and toxicities to enhance drug efficacy.The BBB permeability >0.3 indicates the readily cross the blood-brain barrier and logBB <-1 indicates the poor distribution to the brain.The T2972 and T2S2335 had more logBB values than readily distributed thresholds (Table 4).The AMES and hepatotoxicity indicate that T2972 and T3054 had a probability of being toxic, and were excluded for further downstream analysis.

Molecular dynamics simulation
The RMSD or root mean square deviation of the simulation trajectories defines the stable nature of the drug-protein complex.Fig 2A ) indicates that initially, the drug-protein in the dynamics system had a higher RMSD trend which represents the more mobile nature of the drug-protein system at the initial phase of the simulation.Every complex became stable after 30ns and maintained the stability of the complex till 100ns periods.The CID-9817839, and CID-5271805 complexes had higher RMSD when compared to other complexes which indicate comparatively less stability than other complexes.The overall RMSD of the compounds were below 2.5Å which defines the overall more stable nature of the complexes.This RMSD trend also demonstrates the less flexible nature of the complex.The SASa or solvent-accessible surface area of the drug-protein complex from the simulation trajectories represents the surface area changes while higher SASA relates with the extension of the surface area and lower SASA defines the truncated nature of the complexes.Fig 2B ) indicates the CID-5271805 complex had a higher SASA value which indicates the complexes had a loose packaging system during the simulation periods.As a result, the complexes had a more flexible nature while interacting with the complexes.The rest of the complexes had stable SASA trends till 100 ns periods.
The radius of gyration of the complexes indicates the mobile nature of the complexes.Fig 2C ) indicates that the CID-9817839 and CID-5271805 complexes had higher Rg profiles.This higher Rg profile of those complexes shows the higher flexible nature of the complex.The hydrogen bond of the complexes defines the stability of the complexes where Fig 2D ) indicates the complexes had less fluctuating hydrogen bonding trend during simulation.The hydrogen bond trajectories were stable across the simulation time.The RMSF or root mean square fluctuation of the amino acid residues of the protein complex represents the changes and flexible nature of the amino acid residues in a protein complex.Fig 2E ) indicates the complexes had lower RMSF than 2.5Å except few residues.The lower RMSF profile for the maximum residues indicates the stable nature of the complexes.Also, binding free energy was calculated from the MM-PBSA approach from simulation trajectories of YASARA.The more positive energy from MM-PBSA from the YASARA

Discussion
The computer-aided drug design illustrates a new era by providing lesser time and cost as well as a labouring process which makes this method more feasible.It has become an important tool for designing drug analogues and new drug development.Also, it provides a combination of experimental and computational frameworks to accelerate the drug discovery process.The integrated virtual screening, molecular docking, ADMET, and dynamics simulation approaches can aid in shortlisting the most biologically relevant and active compounds [39][40][41][42].The computer-aided drug designing can target specific molecules by using the structural information's and the nature of their binding pattern [43][44][45].
Traditional Chinese medicine (TCM) plays an important role in the treatment of patients for long periods in China and Asian countries.The rich phytochemical content of those plant species; tannin, alkaloid, flavonoids, and other content can be utilized to use drug development with higher efficacy [46].To date multiple drugs have been developed from TCM for example; artemisinin which is found majorly in Artemisia carvifolia known to treat acute leukemia [47], chemotherapeutic drug taxol [48], cardio-protective drug Danshensu [49], and salvicine for treating the solid tumor.Although recent studies show that some TCM compounds have toxic effects [49], systemic investigation of TCM compounds requires to develop of therapeutic drugs from TCM compounds with fewer side effects and toxicity [50].The virtual screened compounds in this study consist of rich diversity of flavonoids, alkaloids, terpenoids, and glycosides [51].These herbal drugs possess multiple evidence of uses in drug development and design with less toxicity [52].
The Lipinski rule of five plays an important role in determining drug-likeness screening.It helped to additionally screen 7 compounds in the virtual screening process.The CID-65752 and DNA topoisomerase complex interaction reveals the interaction at GLU356 and LYS374 residues where residue contact was also observed for the crystal structure of DNA  topoisomerase.The multiple interactions at active site points indicate the strong binding nature of the complex.This compound is also reported to inhibit hyperplasia in rat model [53], also improve cognitive function by improving mitochondrial function [54], treatment of liver diseases [55], and function against pathogenic fungi [55].The CID-5271805 and DNA Topoisomerase had multiple interactions at ARG364, ARG488, THR501, LYS493, and LYS532 residues where the crystal structure had similar interaction sites.The CID-5271805 also possesses a diverse range of effects on the cellular system; anti-inflammatory activity and anti-apoptotic effect [56], promotes M2 polarisation of microglia through PPARγ signalling pathway, and inhibits neuroinflammation [56].The CID-9817839 and protein complex had similar interactions at ASP533, THR501, LYS493, ARG364, and LYS532 residues like crystal structures.The CID-9817839 is involved in inhibiting metastasis [57], suppressing the inflammatory response in rat and human systems [58], improving stress-induced memory impairment [59], and enhancing cognitive function.The CID-51106 and DNA topoisomerase complex had two interaction points similar to the crystal structure at LYS746 and LYS347.Also, the compound had multiple hydrogen bonds, where the interaction or presence of multiple hydrogen bonds indicates the strong binding nature of the complexes.The CID-51106 inhibits esophageal squamous cell carcinoma growth in vitro and in vivo conditions [60], inhibits tumor angiogenesis [61], and suppresses lung cancer tumorigenesis [62].
The molecular dynamics study of the complex reveals that the complexes had exhibited stability across the simulation trajectories and RMSD, RMSF, SASA, and Rg.The lower toxicity and drug-likeness of the compounds were also revealed in ADMET screening, where T2972 and T3054 had been excluded due to the probability of being toxic.Finally selected two screened compounds; CID-5271805 (T4S2126: Ginkgetin) and CID-9817839 (T2S2335: Dehydroevodiamine) had also important roles in cellular activity and development other drugs development that have been represented in the discussion.This study represents that the TCM compound can be used to design and develop the cancer drug, however, validation from the in vitro and in vivo conditions is required for further assessment.

Fig 1 .
Fig 1.Molecular docking study of DNA topoisomerase I with top four active compounds from Traditional Chinese Medicine.Here, a) indicates the surface view, cartoon shape, and 2D view of CID-65752 and DNA topoisomerase I; b) indicates the interaction of CID-5271805 and DNA topoisomerase I; c) represents three different view of CID-9817839 and DNA topoisomerase I; d) represents three different view of CID-51106 and DNA topoisomerase I. https://doi.org/10.1371/journal.pone.0310364.g001

Fig 2 .Fig 3 .
Fig 2. The molecular dynamics simulations study of the DNA topoisomerase I. a) the root mean square deviations, b) solvent accessible surface area, c) hydrogen bond, d) radius of gyrations, e) the root mean square fluctuations of the complexes.https://doi.org/10.1371/journal.pone.0310364.g002

Table 2 . Binding free energy filtering of 22 compounds by MM/PB(GB)SA method.
https://doi.org/10.1371/journal.pone.0310364.t002algorithm indicates more favourable binding.Fig 3 indicates that CID-65752 had higher energy in MM-PBSA compared to other complex which indicates more favourable binding of that complex.